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Abstract. This work develops Monte Carlo Euler adaptive time stepping methods for the weak ap- 
proximation problem of jump diffusion driven stochastic differential equations. The main result is the 
derivation of a new expansion for the computational error, with computable leading order term in a 
posteriori form, based on stochastic flows and discrete dual backward problems which extends the re- 
sults in ]34[ . These expansions lead to efficient and accurate computation of error estimates. Adaptive 
algorithms for either stochastic time steps or quasi-deterministic time steps are described. Numerical 
examples show the performance of the proposed error approximation and of the described adaptive 
time-stepping methods. 
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1. Introduction 

This work develops adaptive methods and proves a posteriori error expansions, with computable leading 
order term, for weak approximation of jump-diffusions driven stochastic differential equations. 

1.1. Problem's setting. Consider X = {X{t) = {X^(t), . . . , X'^{t)) : t e [0,r]}, a d-dimcnsional sto- 
chastic process that is the solution of the stochastic differential equation 



(1.1) 



X{t)^X{{))+j a{s,X{s-))ds + J2 b\s,X{s-))dW^{s) 
c( s, X{s~), z)p{ds, dz), 



on a time interval [0,T] (see III. 2c in or, [0). The randomness in the equation is generated 
by (i) an J-'o—measuiahle d-dimensional random variable -'^(O), (ii) a standard £o-dimensional Wiener 
process W = {W{t) = {W^{t), . . . ,W^^'{t)): t e [0,T]}, i.e. its coordinates are independent standard real 
valued Wiener processes, and (iii) a Poisson random measure p on [0, T] x Z, where Z = R^i\{0}, with 
deterministic time dependent intensity measure q{dt,dz) = X{t)dt (g) ^{t,dz). Here A: [0,T] —>■ [0, A,nax] 
is the time intensity of jumps with Amax G (0,-|-oo) and, for each t > fixed, ij{t,dz) is a probability 
measure on Z. All processes are defined on a stochastic basis B — {H., !Ft, {^t}o<t<T, P), and, as usual 
in this framework, we assume that the Wiener process and the Poisson random measure are independent. 
The coefficients a: [0,T] xR'^ ^R'^,b: [0,T] x R'^ ^ R'^^^" and c: [0,T]x R'^xZ^R'^ are assumed to 



be Borel functions, satisfying regularity conditions defined in Lemma p.l , 

For a given scalar function g : R*^ ^ R, the goal of our work is to construct approximations to the 
expected value E\^g(^X{T))^ by a Monte Carlo Euler method (cf., e.g., [^). In what respects the 
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dimension, although our developments are valid for all d G N, we have in mind relatively high values of 
d, taking into account the curse of dimensionality present in concurrent deterministic methods that solve 
partial integro-differential equations. The computational approach presented is common in computing 
option prices in mathematical finance and in simulating stochastic dynamics (cf., e.g., |8|, ||3l||). 

Remark 1.1 (Construction of the integral with respect to the Poisson random measure). 

Consider a sequence 61,62,... of independent random variables with common exponential distribution 

with parameter 1. Define 

(1.2) K{t) = [ X{s)ds, t e [0,T]. 

Jo 

The number of jumps of the random Poisson measure p{dt,dz) in an interval [0,<] is determined as 

k 

iV(i)=max|fc: ^ej <A(t)|, 

and the total number of jumps in [0, T] is denoted by N = N{T). The jump times of the Poisson measure 
can be defined by tq — 0, and Tk = A~^(ei + • • • + e^) for fc ~ 1, . . . , N , and can be computed recursively 
by 

(1.3) efe = / X{s)dt, for fc = l,...,iV. 

J Tk-l 

Once the jump times are computed, we proceed to sample the marks {Z^}, that, conditionally on the values 
of the jumps times, are independent random variables distributed respectively according to {fi{Tk,dz)}. 
The random measure with intensity q{dt, dz) — \(t)dt ® fi{t, dz) can then be constructed as 

N 

p{ds,dz) ^^S(^^^^Zk)ids,dz), 

k=l 

and, consequently, the stochastic integral with respect to the Poisson random measure, i.e. the last term 



in (1.1), can be computed as 

N(t) 



/ / c(s,X(s-),z)p(ds,dz) = ^c(Tfc,X(T,7),^fe)' te[0,T]. 
Jo Jz 

Remark 1.2. Since the time intensity X{t) is deterministic, the jump times {rk} can be directly obtained 



before solving for the process X ; it is enough then to find the function A by accurately integrating (1.2) 



and then successively finding {rk} by solving (1.3). Furthermore, if X is just a constant then we .simply 
sample = ei + - • ■+ek from a sequence {e^} of independent random variables with common exponential 
distribution with parameter X. 

1.2. The Monte Carlo Euler method. We now present the Monte Carlo Euler time stepping algorithm 
which will be the building block for adaptive algorithms, with either quasi-deterministic or stochastic time 
steps. For each realization we first construct the jumps and its marks, and conditioned on this information 
we construct the approximate solution X as follows. 
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Monte Carlo Euler time stepping algorithm 

Input: Give a number + 1 of time nodes = to < ti < 



< tjq ~ T , and sample jmnp 



times < ti{ljj) < • • • < rjv(i^) (ti-") < T with corresponding marks Zi(uj), . 
explained in Remark 



1.1 



as 



Set the jump counter k = 1. 

Time stepping: Consider an augmented partition given by the union 

where Na{uj) = N + N{uj) {a.s.) is the number of time steps. 

Sample Xo{lu) and set the initial condition X{to,u!) — Xq{uj). 
For time steps n = 0, . . . , N^iuj) — 1 

Compute the remainder approximate grid values, X{tn+i,Lu), 

by first constructing the left limit value of the approximated process, 



(1.4) 



(1.5) 



X{t^^j^,uj) = X{tn,uj) + a(<„,X(i„,cj))(t„+i - <„) 
eo _ 

+ J2b' (i„ , Xitn ,io)){W' (t„+i , u;)-W^ (i„ ,Lu)). 
1=1 

When needed, introduce the correction due to jump discontinuities: 
if {tn+i = Tfe(w)) then 

X{tn+l,^) = ^itn+l^^) + c(tn+l,^(t;^+i,ti^),^fc(w)), 

increase fc to A: + 1 

else 



(1.6) 

end-if 
end-for 



X(t„+i,w) = X(f„^i,w), 



When the initial time nodes {toj • ■ • j in} are the same for all realizations, we refer to quasi- deterministic 
time steps, or sometimes, simply deterministic time steps. Otherwise, we speak about stochastic time 
steps. 

Besides, in the particular case when a = h = Q the approximate process X has the same law as X, due 
to the form of the grid proposed in the numerical method (see Jisf ) . A similar situation occurs when only 
6 0: there, conditioned to the realizations of N , a higher order method for ODE integration should 
be used to approximate X between jumps. It must be also noticed that the jump intensity A does not 
depend on the current value X{t) of the process. Such a dependence carries the necessity of implementing 
a different Monte Carlo Euler algorithm, where the jumps and its marks are sampled simultaneously with 
the trajectory of the process, generating an aditional error in the approximation scheme, as it does not 
seems possible to have an exact simulation of the jump structure in law, something that is possible in 
the present case. 



1.3. Error Control and Adaptivity. The aim, for a given TOL > 0, is to choose the size of time 
steps, either quasi-deterministic or stochastic, 

Af„=f„+i-t„, n = 0, . . . ,7V - 1, 

and the number M of independent identically distributed samples {^X{- , Wj)}^^^^ such that the compu- 
tational work, defined as M times the average of timesteps, i.e. M x E[Na] = M x {E[N] + A(T)), is 
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minimal, constrained by the condition that the computational error Ec defined by 



M 



(1.7) 

is such that the event 
(1.8) 



Ec = E\g{X{T))\ - ^ ^ .g(X(r, uj,)\ 



\EA < TOL 



has a probability close to one. The computational error Ec naturally separates as the sum of the (deter- 
ministic) time discretization error Et and the statistical error Es given by 



(1.9) 
(1.10) 



E^^E[g{X{T))]-E[g{X{T))], 



M 



The time steps to construct the trajectories X are determined from statistical approximations of the 
time discretization error Et] while the number M of independent realizations of X, is determined from 
the statistical error Es- Therefore, the number M of realizations can be asymptotically determined by 
standard limit theorems for sums of independent random variables. 

Efficient adaptive time stepping methods, with theoretical basis, use a posteriori error information, 
since the a priori knowledge can usually not be as precise as the a posteriori. The present work develops 



adaptive time stepping methods by proving in Theorem 2.3 and Theorem 3.3 error estimates of Et with 
leading order terms in computable a posteriori form. 

where 



The main reference of Theorems 2.3 and 3.3 is the work by Szepessy, Tempone and Zouraris 



similar techniques were applied considering the solution of a stochastic differential equations, i.e. our 
case when c — 0. The general inspiration of the presented results is the work by Talay and Tubaro in 
p9| and its subsequent extension by Protter and Talay in to stochastic differential equations driven 
by Levy processes. 

The main new idea here is the extension of the efficient use of stochastic flows and dual functions to 



obtain the error expansion with computable leading order term in Theorems O and 3^ in presence of 
jumps. The use of dual functions is standard in optimal control theory and in particular for adaptive 
mesh control for ordinary and partial differential equations, see [Q, jl^, and [^, and was 

successfully applied in the probabilistic context in |Q , |2^ , ||ll[ and in [^7| . 
Concerning time steps, this work proposes adaptive algorithms that use either 



"quasi-deterministic" time steps (see section [4.3| ), in the sense that, although the grid is random, 
its randomness only depends on the jumps, being in this sense is a strict generalization of the 
deterministic time step algorithm in ||34| . 

"stochastic" time steps, meaning that the grid is random also with respect to the Wiener measure, 
similarly as with the stochastic time steps introduced in l34l . 



Theorem 2.3 describes the basis of an adaptive algorithm to estimate the computational error; the 



deterministic time steps are then chosen by 



{Mnf\E[p{tn,uj)]\ = constant, 



- observe that we can only control the deterministic points tn and not the whole augmented grid 
{tn{^y\n=o^ ~ where p{tn,(-ij) is the function defined in ( p.9[ ), based on the weight functions tp and 
(p' defined in (A.l) - ( |A.4[ ). Provided the path X{tn), n = 0, ...7V^, is stored, the leading order er- 
ror bound can be evaluated by solving, step by step, the two backward problems (A.l) - (A.4). The 
backward evolutions (A.l) - (A.4) of the weight functions (p and (p' avoid solving for the two vari- 



ables t,s present in which appears in the forward evolution equation for in the identity 

P>i{tn) = djg{X{T))dXj{tn)/dxi{tn), cf. ( ^.34 ). A solution with two variables s and t would require work 



of the order iV^ for each realization, instead of the corresponding work of the order N in Theorem 2.3 



The second algorithm, presented in Section 4.4, is based on the expansion derived in Theorem 3.3 and 
uses time steps which may vary for different realizations of the Wiener process. Stochastic time steps 
are advantageous for problems with singularities at random times. The idea in this case is to choose the 
steps by 

(Af„)^|p(t„, a;)| = constant, 

(where, in comparison with the previous algorithm, there is no expectation), and this is achieved through 
a test performed at each interval of each realization, to decide whether to refine or not the given interval. 
In this case, when a node is added, the interpolation is carried through the consideration of a Brownian 
bridge to obtain the value of the approximated process X in the new added point. 

Besides, since their use entails more work per realization than the deterministic time steps they should 
be judiciously used. A natural application of stochastic time steps appears in the weak approximation of 
killed diffusions, see 0. The optimal stochastic steps depend on the whole solution X{t), < t < T, 
and in particular the step At{t) at time t depends also on W{u) for t < u. In stochastic analysis the 
concept of adaptedness means, intituively, that the values of the process at time t depend only on the 
events generated by the sources of randomness up to time i, i.e. do not depend on future events. In 
numerical analysis a method is said to be adaptive when the approximate solution is used to control the 
error, e.g. to determine the time steps. Our stochastic time stepping algorithm is in this sense adaptive 
and non adapted, since the time steps At{t) depends on values of W{u) for t < u, i.e. on future values 
of the Wiener process. The stochastic time stepping algorithm (that achives higher precision) requires, 
on one side, an aditional theoretical developement including the introduction of Malliavin derivatives, 
and, on the other, requires the approximate computation of derivatives of second order. This second 
requirement is performed introducing the second derivatives ip" of the fluxes (the procedure is described 
in the Appendix). This computation is performed also in linear time. 

The focus in the paper is on error estimates for weak convergence of stochastic differential equations 
for diffusions with jumps. Two particular important cases must be distinguished. In first place, taking 
c — we obtain previous estimates in [ p4| . More relevant in this instance is the situation when a — b — 
and c 7^ 0. In this case, we are considering a pure jump process, with finite number of jumps in the 
time interval [0,T]. From the stochastic point of view, the situation is simpler, and the process X has 
exactly the same distribution as X. This means that the time discretization error is null, and the problem 
reduces to controlling the statistical error. 

Furthermore, the deterministic problems associated with the computation of the expected value 
E[g{X{T))] via the Feynman-Kac formula are non local, in the sense that they involve integro-differential 
equations. Direct discretization of such time dependent integro-differential equations needs to approxi- 
mate integrals at each time step, while the Euler Monte Carlo method avoids these expensive computa- 
tions. 

When d is small and the Fourier transform of X{t) is known (for instance, AT is a Levy process), it 
is efficient to approximate E[g{X{T))] based on Parseval's identity for this transform and the Fourier 
transform of g [^3[ || . The work uses operator splitting to approximate the integral term explicitly in 
Fourier space, while approximating the other terms in the equation implicitly. The work |p5|| discretizes 
the partial integro-differential equation by the 0-scheme in time and a wavelet Galerkin method in space. 
The resulting full Galerkin matrix is then replaced with a sparse matrix in the wavelet basis, and the 
linear systems for each time step are solved approximatively with GMRES in linear complexity. The 
deterministic algorithm gives optimal convergence rates, up to logarithmic terms, for the computed 
solution in the same complexity as finite difference approximations of the standard Black-Scholes equation. 
Other works include |24 , where weak convergence schemes are analyzed, [ p^ where two implicit numerical 
methods for diffusion with jumps are presented, analyzing strong convergence and nonlinear stability; 
and that contains a survey, including some new results on weak approximation for jump diffusion 
equations. 

The technique used here is based on the Kolmogorov's backward equation developed in |Q and |Q 
to analyze uniqueness and dependence on initial conditions for weak solutions of stochastic differential 
equations with jumps. The rest of the paper is as follows. Section ^ proves error estimates for quasi- 
deterministic time steps. Section ^ proves error estimates for stochastic time steps. Section |^ presents 
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implementations of adaptive algorithms and, finally, Section g includes results from numerical experi- 
ments. 



2. An Error Estimate of the computational error with deterministic time steps 



In this section we present in Theorem 2.3 an error expansion in a posteriori computable form for the 
time discretization error E^.- The starting point for the analysis is Lemma below. It uses the fact that 
the Euler method can be extended, for theoretical purposes only, by 

_ _ /■* _ /■*_<. _ 

(2.1) X{t)-X{t,,)^ a{s;X)ds + J2 b{s;X)dW\s) Vi £ [i„, i„+i ), 

_ —I 

where a and h are the piecewise constant stochastic approximations 

(2.2) a(s;X) =a(i„,X(i„)) and 6%;X) = &^(i„,X(i„)) for s€[i„,i„+i). 

Observe that the presence of jumps is realized in X in ( |1.5| ), making possible not to introduce a modified 
coefficient for c. For simplicity we introduce the notation 

_ d _ _ d 

dk = — J dki = — 7{ — , ■ ■ ■ = 1 
oxk oxk.oxi at 

and use the summation convention, i.e., if the same subscript appears twice in a term, the term denotes 
the sum over the range of this subscript, e.g. Cikdkbj = X]fc=i Cikdkbj, and consequently 

For a derivative da the notation \a\ is its order. 

Lemma 2.1. Suppose that for some mo > [d/2\ + 10 there are positive constants k and C such that 

(i) g e C^'(M'^) with \da_g{x)\ < C(l + Ixl") for all \a\ < mo, 

(ii) £;[|X(0)|2'=+'^+i + |X(0)|2'=+'^+i] < C, and 

(iii) a and b are bounded in C™" ([0, T] x R*^) and the same holds uniformly for c(-, •, z) for all z e Z. 

(iv) the initial data X{0) and its approximation X{0) have the same distribution. 



Then, the time discretization error in (l.£) satisfy 



(2.3) f E[{ak{t,X{t-))~ak{t;X))dku{t,X{t-j) + {d,,{t,X{t-))~d,,{t;X))d,ju{t,X{t-))]dt 
Jo 

where 

(2.4) uit,x) = E[giXiT))\Xit)^x]. 
is the cost to go function. 

Remark 2.2. We can relax condition (iv) in the assumptions of the Lemma, and get an additional term 
of the form E[u{0, X{0)) — u{0, X{0))] in the error expansion in (pT^). 

Proof. There exists a unique solution u £ Cj^^{[0,T] x W^) of the Kolmogorov backward equation 
(2.5) 

C^u{t, x) = dtu{t, x) + akdku{t, x) + dkndknu{t, x) + A(<) j [u{t, x + c{t, x, z)) — u(t, x)]ii{t, dz) — 

u{T, ■) = g, 
satisfying the polynomial growth condition 

max^\dau{t,x)\ < C(l + |x|'=+'^) V|a|<6, 



for some positive k and C (of. The Feynman-Kac formula without potential, implies that the 

solution u of (2.5) can be represented by the expected value in (2.4). The Ito formula applied to (2.1) 
(cf. 0, p. 66) gives 



u{T,X{T))-uiO,X{0)) ^ / {^tu{t,X{t-))+a,{t■,X)^^uit,Xit-))+d^J^^Ju{t,Xit-)) 

Jo 

+ X{t) J [u{t,X{t-) + c{t,X{t-), z)) - u{t,X{t-))]fj,{t,dz)}dt 

■ / bl{t;X)d^u{t,X{t-))dW'^{t) 
Jo 



+ 



+ [u{t,X(t^)+c(t,X{t^),z))-u{t,X(t-))]{p{dt,dz)~q{dt,dz)) 



JZ 



which, combined with ( |2.5| ) to substitute dtu{t,X{t )), yields 

u(0,X(0))-(/(X(r)) - j {a,{t,X{t-))^a,{t-X))d,u{t,X{t-))dt 



(2.6) 



{d,j{t,X{t-)) - d,j{t;X))d,ju(t,X{t-))dt 



hi {t; X)diu{t, X{t-))dW\t) 



[u{t,X{t^) + c{t,X{t~),z)) - u{t,X{t~))]{p{dt,dz) - q{dt,dz)). 

Jz 

The expected value of the last two integrals is zero, the first one by the martingale property of Ito 
integrals, and the second one due to the fact that p{dt, dz) — q{dt, dz) is a compensated ra ndo m measure 
(i.e. a martingale measure). Condition (iv) in the Lemma, and the representation of u in (2.4) show that 

E[u{Q,X{0))\ =E[u{Q,Xm]=E[g{X{T))\. 

Therefore, taking expected values in both sides of (p]6h we arrive at the error representation (^.3|). □ 



Lemma 2.1 is combined with stochastic flows to derive the a posteriori error expansion in Theorem p.3| 
below. This error expansion is based on the variations of the processes X and X. For a process X , the 
first variation of a function F{X{T)) with respect to a perturbation in the initial location of the path X, 
at time s, is denoted by 



(2.7) 



F'{T-s) = d,^,)F{X{T))={d^F{X{T)- X{s) ^ x), . . . ,ddF{X(T); X{s)^x)). 



The proof of Theorem 2.3 uses mainly that the error in replacing g{X{T)) in Lemma p.l| by g(X{T)), 
in the representation (2^) of daU, yields the small deterministic remainder term 0(^{At)^)dt in (2 



of Theorem 2.3, which is analogous to the 0{N~'^) term in Talay and Tubaro's expansion, cf. |39|, and 
needs some a priori estimate to be controlled. Lemma 2.1 can be applied to estimate this error. The 
second important ingredient in the proof is the Markov property of X satisfied at the discrete times t„. 
Based on the fact that X{tn) is measurable, the nested expected values 

E[a^{tn,X{tn))d,^^t^)E[g(X{T)) \ TtJ] 



in (2.3) can, by the definition of ip and its implication (2.34), be decoupled to 

E[aj{tn, X{tn))(Pj{tn)], 



which reduces the computational complexity substantially, see Lemma 2.i 

7 



Theorem 2.3 (Error expansion with deterministic time steps). Suppose that a, b, g, X and X , satisfy 
the assumptions in Lemma W\ Then, the time discretization error in (|l.9|) has the expansion 



(2.8) 



■JV-1 



171—0 



E 



N-1 



m=0 



where the leading order error term is in computable a posteriori form. 

1 
2 



(2.9) p(i^,w) = i [(ai(t„+i,X(i„_^i,w)) - ai(t„,X(i„,w)))(p,(i„+i,w)]^^^ 



^ Yl [idtkitn+l,X{t^^j^,Lu)) - d^kitn,X{tn,UJ)))ip',kit„+l,UJ)] 



{Atr. 



with Jm = \n : tm < tn < tm+i}, m — 0, . . . , N — 1, and based on the discrete dual functions (p{tn) G M'' 
and(p'{tn) e M''^'', which are determined as follows. The function ip and and its first variation 

_ dipi{tn;X{tn) = x) 



(2.10) 



dxk 



satisfies (|A.l| ) - ( |A.4| ). 

Remark 2.4. When implementing an algorithm based on this result the expectation of the sum of errors 
in (2.8) is approximated by the mean of the errors along the M simulated trajectories, i.e.: 



E 



'N-l 



7] 



.m=0 



M N-l 



-TT 



j — \ m— 



M 



(At, 



The statistical error of this approximation can be expressed as 



E 



Y p{i,n,uj){Ai 

m ) 



M N-l 



1 \ ^ \ ^ P{tm., ^j) f \Z \2 



j — 1 m—Q 



(Ai^f = / {lM+IlM)dt 



where the distributions of the statistical errors v MIm and v MIIm weakly converge to normal distribu- 
tions with mean zero and time interval dependent variances given by 

Var[ Y K(i«+i,^(i,T+i))-a.(in,^(in)))^«(t,T+i)] =0{At„,), 

n&Jm 

and 

Var[ Y {d7k{tn+i,Xit;^+,)) - d,kitn,X{tn)))ip'^,{t-^,)] =0(Ai,„), 



respectively. 



Proof of Theorem \2. Sj. T he main content of Theorem 2_^ is the replacement of the (non computable) 
estimate in Lemma 2A by an expansion with computable leading order term. For this purpose the 
derivatives of the expected value 

dM^,t) = da,E[giX{T)) I X{t) = x] 
appearing in the integral in Lemma are approximated by the corresponding derivatives of 

uix,t) = E[g(XiT)) \Xit) x], 

that depends on the simulated solution X. The proof is divided into three steps: 
we estimate the quadrature error; 



(i) in Lemma 2.5 

(ii) in Lemma 



its variations; 



2.6 we bound the error in replacing daU by daU with the use of stochastic flows and 

(iii) in Lemma 2.8 we use the discrete dual functions ip and tp' (that solve the backward evolution 

problems see (A.l) - (A. 4) in the Appendix) to derive a computable representation of daU. 
We begin with the first step. 



Lemma 2.5 (Quadrature approximation). Suppose that the assumptions in Lemma 2.1 hold. Let Jm 
{n : < t„ < tm+i}, m = 0, . . . , — 1. Then the quadrature error terms satisfy 



E[{a,{t, X{t- j) - a,{t; X))d,u{t, X{t-))\ dt 



E[ J2 (a.(tn+i,^(i,T+i))-«»(i«,^(tn)))9."(tn+i,^(i,T+i))^] =0((At;„)3), 



E[{d,,{t, X{t-}) - d,,{t; X))d^ju{t, Xit- j)] dt 



Proof. Denote by Q the cr-algebra generated by the jumps and marks in [0, T] constructed in Remark 
Then 



1.1 



E[{d,j{t,X{t-)) ~ d,,it;X))d,juit,Xit-))]dt 

= E[Y. f E[{d,jit,X{t-)) -d,j{t;X))d,ju{t,X{t-)) I g]dt]. 



(2.11) 



Observe that in [i„,i„+i) the conditioned process X has no jump discontinuities, and introduce the 
notations 

h{t,X(t-)) = {d,j{t,X{t-))-d,j{t;X))d,ju{t,X{t-)), 



Then the quadrature error satisfies 

E[h{t,x{t-))-h{t) I g]dt 



£;[(d,,(i,x(t-))-dy(t;X))ft,u(i,x(i-)) I g]dt 



E[{d^jitn+l,X{t-+J) - d.,,{tn;X))d^Mtn+l,X{t-+J) \ Q] 



2 ' 



and E[h{t) \ G] is the Hnear nodal projection of the smooth function E[h{t,X(t )) | Q] in the interval 
[tn,tn+i). Therefore, a standard interpolation estimate yields 



(2.12) 



E[h{t,X{t-))-hit) I g]dt\ < i(At„)2 f ^^\^E[h{t,X{t-)) I g]\dt 



Denoting Ch = dth + aidih + dijdijh, Ito's formula and condition (iii) in Lemma 2.1 show that there exist 
constants Ci, C2 such that, for t e (t„, i„+i) 

d 



(2.13) 



^^E[h{t,X{t-)) I g] = E[Ch{t,X{t-)) I Q] < Ci, 

^E[h{t,x{t-)) I g] = Eic'hit.xit-)) I g] < C2, 



which combined with ( ^ITl) and (I .12|) proves the estimate of the diffusion term in the lemma. The 
estimate of the drift term follows analogously. □ 

In the second step of the proof the derivative daU and its approximation daU are evaluated respectively 
through expected values of stochastic flows of X and X and its variations, that we now introduce. Recall 
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5. 



ik 



the definition (2.7) of the first variation, and let 

1 i = k. 

The following equation for the first variation of the process X at time s > t hold: 

dXr{s) = dka,{.s,X{.s-))X^s-)d.s + dkb'i{s,X{s-))X',^is-)dW\s) 

(2.14) + J^dkC,is,X{s-),z)X'f,jis-)p{ds,dz), 

X[j{t) = 

Similarly, for the second variation of the process X at time s > t we have: 

dX':^^{s) = [a,a,(s,X(,s-))X^;„(s-) + 9,,a,(s,X(,s-))X^^.(s-)X:,„(s-)]ds 

+ [9,5f(s,X(s-))X^',„(s-) + a,.6f(,s,X(s-))X^^.(,s-)x;„(s-)]dM^^(s) 

(2.15) 



+ [9feC,(s, X{s-), z)X'^^^{s-) + dkMs, X{s-), z)X^j.(s-)X;„(s-)]p(ds, dz), 

x;;„(<) = 0. 

For the third variation of the process X at time s > t we have: 

dX':;^^{s) = [5fca.(5,X(s-))X^'j„„(s-) +9fe,a,(s,X(s-))X(,^-(s-)X;'„,„(s-) 

+ afera4s,X(s"))Xfc„(.s")X;^„(s") + afc,.aj(s,X(s~))Xfc„(s~)X"^„(s") 
+ a,™a,(s,X(s-))X[^.(s-)X;„(s-)X„(s-)]ds 

+ [afe6f(s,x(s-))x^';_(,s-) + afe.6f(,s,x(s-))x^^.(s-)x;'„,„(s-) 
+ afc.6f(s,x(s-))x[„(s-)x;;„.(,s-) + afe.6f(s,x(s-))xL.(s-)x;;„(s-) 
(2-16) +a,™&f(s,x(s-))x[^.(s-)x;„(,s-)x;„(.-)]dw^^(s) 

+ / [afec,;(s,x(s-))x-„,„(s-) + a,.c,(s,x(s-),z)x^^.(s-)x;'„,„(s-) 



'z 

+ 9fc,c,(s, x(s-), z)x[„(s-)x;;„(s-) + ^krc^{s, x{s-), z)x(,„ (s" )x;;„ (s" ) 

+ 9/c™Q(s,X(s~),z)Xfcj-(s")X;^„(s")X^„(s")]p(ds,dz), 
and similarly for the fourth variation of the process X at time s > t: 

(2-17) dXl'J^^p = ■ ■ ■ , Xtjnnipi^) = 0- 

This equations imply the representation of the derivatives of expectations with stochastic flows as follows 
(cf. and [^). For the first derivatives: 

(2.18) d^u{t,x) ^ E[d.,g{X{T))Xi,{T) \ Xi^{t) = %,X(t) = x] , 

for the second derivatives: 
(2.19) 

dknu{t,x) = E[d,g{X{T))X[l,{T) + drrg{X{T))X[^{T)X'„,{T) \ X^^^it) - 0,x;^-(t) = %,X(t) - x], 
for the third derivatives: 

(2.20) dknmuit, x) = E[d.g{X{T))Xg,„,{T) + d„g{X{T))X[^,{T)X':^„m 

+ d,rg{X{T))X[^{T)X':^^{T) + d,rg{x{T))xUT)x':^^{T) 
+ d„.g{X{T))Xi^{T)Xl^{T)X[,^{T) \ XX,™(i) = = 0, X[^{t) = X(t) = x] , 

and for the fourths derivatives: 

(2.21) dknrnpU{t,x) = . . . . 

10 



Let Y = {X, X', X", X'", X""Y and let / denote the d x d identity matrix. Then the system ( ^.14H2.17| ) 
can be written 



(2.22) 



dY = A{t,Y{t-))dt + B\t,Y{t-))dW\t)+ I C{t,Y{t-),z)pidt,dz), t > to 
Y{to) = (x, 7,0,0, of . 



Furthermore, rewrite the representation (2.1S-2.21) as 

MY) = dkg{X)XU, 
f,,{Y) = dkg{X)X'^,^ + dkn9{X)X',,X' 
M^{Y) = dkg(X)X'' + dkng[X)X'^,X, 



(2.23) 



njm 

dkn9iX)X'i.jX^^^^ + dkn9{X)X[^-^X" 
dknvg{X)X'f.^X'^ J X'^^ , 



fijmn{Y) — 



The Euler approximation of Y, the solution ( 2.22 ), is denoted by y = (^Y^ ,Y^ ,Y^ ,Y^ ,y'^ )^ and can be 
extended as the solution of the stochastic differential equation with piecewise constant drift and diffusion 
fluxes 

dY^A{t-Y)dt + B\t-Y)dW'''{t)+ I C(t,Y,z)p{dt,dz), 



(2.24) 

J z 

defined as in ( |f .4] ), ( pT^ ) and ( ^.fj ), with coefficients defined as in (p^). 

An important consequence of the Euler method is that the variation and the Euler discretization 
commute. This yields for each a the representation 

(2.25) do^u{t,x) - dMt,x) ^ E[UY{T)) - UY{T)) \ Y{t) =Y{t) = (.t, J, 0, 0, 0)^] . 

Now we rely on Lemma 2.f applied to the process Y to obtain the representation 

(2.26) 

E[UY{T))-UY{T)) I Y{t)^Y{t) = (a;, /, 0, 0, 0)^] 



E 



{A- A)kdkv''{s,Y{s-)) + {D- D)kndkn'y"{s,Y{s-)) \ Y{t) = (x, 7,0, 0,0)^ 



ds 



to be used in Lemmas 2.6 
as 

we use the notation 



and 2 



below, where the cost to go function is, for each fa in ( 2.23 ), defined 
,.'^{t,y)^E[UiYiT))\Yit)^y], 

Dkn = ^B^B^, Dkn = \Bf,B.^, 

and the corresponding Kolmogorov Backward equation is 

= dtv-^ + Akdkv'' + Dkndkniy" + A(t) / [i^"(t, Y + C{t, Y, z)) - iy"{t, Y)]^l{t, dz) = 0, < < T, 



We are ready for the second step. 

Lemma 2.6 (Approximation of dau). Let the piecewise constant mesh function At be defined by 
At{s) = At„ for s e [tn, tn+i) and n — 0, . . . , iV^(a;) — 1. 



Suppose that the assumptions in Lemma 2.1 hold. Then the discretization errors of the stochastic flows, 
for \a\ < 4, satisfy 

(2.27) da{u - u){tn,X{tn)) - ^ E[0{At{s)) I J^tjds = 0{Ai,^a.)- 
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Furthermore 



(2.28) 



£;[(a,(i„+i,x(i;^,)) -a,(i„;X))(9,7/(t„+i,x(i,;+i)) -a,s(t„+i,x(i;+i))) | g] 

= Atn / 0{At{s))ds, 



(2.29) 



i-T 



Atr. 



0{At{s))ds. 



Proof. The combination of (2.25) and ( ^.26 ) give 



(2.30) dc.{u-u){tn,X{tn)) = E[{M- A,)d^u''{s,Y{s~)) + [D,,-D,j)d^ju''{s,Y{s-))\Tt„]ds 

Now, for tm < t < tm+i, introduce the notation 

h{t,Y{t)) = {A,~A,)d.,J^''{t,Y{t)) + {D,j^D,,)d,,,y"{t,Y{t)), 

and 

Caw{t, y) = {df W + AndnW + Dkndknw) (i, y) 

C^w{t, y) = Cow{t, y) + \{t) / [w{t, y + C{t, y, z)) - w{t, y)]^i{t, dz). 



Observe that h{tm,Y{tm)) — 0, apply Ito's formula in the interval [tm,t] to obtain 



E[hit,Y{t~)) \ Tt„ 



E[C^h{s, Y{s')) I Ttjds = 0(Ai,„), i„ <t< tm+l, 



where the bound follo ws as in ( 2.13 ). This plugged into ( ^.30 ) proves ( |2.27| ). 
The estimate ( 2.29 ) follows similarly by defining now 

h{t,X{t)) = {d,j-d,j)d^j{u-u){t,X{t)). 

Then the Ito's formula shows as in ( ^.12| - pTT3 ) 



E[h{t,X{t)) \Q]^ j E[Coh{s,X{s-)) I g]ds, t„<t< t„+i. 
The final step to prove ( ^.29| ) is to establish 



E[Cohis,X{s-))\g] 



0{At{T))dT. 



The function Cq h{s, X{s )) splits into the two types of terms hi = {d ~ d)v and /12 = vda{u — u), with 
smooth functions v of (s, X{s)). The Ito formula again shows that 



E[hiis,X{s-))\g] = J E[Cohi{T,X{T-))\g]dT = OiAt^), i„<s<t„+i. 
Moreover ( 2.27 ) implies 

E[h2is,X{s~)) \Q]^ 0{At{T))dT, 

and consequently (2.2S) holds. The estimate ( 2.2^ ) of the drift terms follows analogously. 



□ 



Definition 2.7 (Local discrete solution operator). Write the Euler time stepping for the time nodes 
t — to,t^ ,ti,t2 ,t2, ■ ■ ■ as 

(2.31) X{tnext,i0) = (p{t,X{t),Uj) 
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where the next computation time is given by 

. _ / *n if t^t:^ 

Observe that depending on uj and t, we may have 

{x + a{t,x)M + b\t,x)/\W^, if t = tn, 

X + c{t, X, Zk{oj)), if t = and t„ is the k-th jump time, 

X if t = t^ and t„ is a not a jump time. 



Lemma 2.8 (Representation with discrete duals). Suppose that the assumptions in Lemma 2.1 hold. 
Then the dual functions f and ip' , defined Appendix satisfy for t — tn and t — 

(2.32) d,u{t,X{t)) ^ E[ip,it) \ Tt], 



(2.33) 



d,,uit^X{t))=E[p',^{t)\Tt]. 



Proof. Equations (2^), ( 2.14 ) and ( 2.18| ) show that the first variation of the Euler approximation X is 
in fact equal to the Euler approximation of the first variation X' and consequently 

dMt.Xit)) = E[d,g(X{T))xUT;t) \ Tt\ 



where Xj^{s;t) (s > t), is the Euler approximation ( 2.24 ) of X' with initial data Xj^{t:t) = Sjt. 
Let, for t = ... ,i„,i;^+i, . . ., 

^,{t) = 9,^(,)5(X(T)), 

i.e. 



(2.34) 



p,{t)=d,g{XiT))X^,{T;t). 



We prove inductively that ipi{t) is the solution of the corresponding problem in (A.1)-(A.4). Since (A.l) 
is trivially true it remains to prove the inductive step. By the chain rule we have 

dkg(X{T))xUT; t) ^dkg(XiT))%j{T; t„,,t)x],{tne.t;t) 

or in other words 



(2.35) 



ip^{t) = ipj{t„^^t)d.i^jit,X{t)) 



which is equivalent to (A.1)-(A.4), what we wanted to prove. 
The equality (2.34) implies that 

d,,u{t,X{t)) ^ E[d,^(^t^ip,{t) \ J^t]- 
The next step is to verify that the first variation of (p, 

lo iR\ - A u\ - 9(Px{t;X{t) ^ x) 
(2-36) ip.^j{t) = d.^^i^t)'Pi{t) = , 

satisfies the backward recursive equation (A.1)-(A.4). First, differentiate the equation ( 2.35| ) to obtain 
^',,{T) = d,,g(X{T)). 



(2.37) 



Observe that the problem ( |A.l| )-( A.4) shows that p{tnext) depends only on the point values 

{X{s) : t„,,t <s<T}, 

so that 

(2-38) dx^{t)fj{tnext) — dj.^(^t„^^t)'Pj{tnext)dx^{t)Xp{tnext)- 

Finally, the definitions of X and <f> in ( |2?l| ) and ( |2.3l| ) imply 

(2.39) dk%{t,X{t)) ^ d,,,^t)Xp{tnext), 
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which together with (2.37-2.38) prove that ip' satisfies the recursive equation 
v[k{T)^d.,k9{X{T)), 



(2.40) 



which is equivalent to ( [A.l| )-( A.4). □ 

Remark 2.9. The measurability of (^ai{t^^-j^, X (t^^-^^)) — ai{tn, X (tn))) E T^- proves that for t — t^_^_-^ 
and any random variable [3 we have 

E[{a,[t,X{t))-a,[t,,,X{tn)))E[l3 \ Tt\] = E[E[{a,{t,X(t))~a,{t,,:X))(i \ Tt]] 

^ E[{a,{t,X{t)) -a,{t^-X))l3\, 

and in a completely similar way we obtain the same result for the diffusion terms, i.e. 

E[{d,jit,X{t)) -d,jitn:X))E[l3 I Tt]] = E[{d,,{t,Xit))-d,j{t„;X)))(3]. 



The proof of Theorem 2.3 is now concluded by combining Lemmas 2.5, 2.6,2.8, Remark 2.9, and the 
Central Limit Theorem to estimate Im and IIm- We have 



E[{a,{t,Xit-))-a,{t;X))d,uit,Xit-))]dt 
=E[ {a,it,,+,,Xit-^,)) - a,it„,X{tr.)))dMtn+uX{t-^,))^] 

N-1 

^E[ {a.{tn+i,X{t-^,)) - a,it„,Xit„))dMtn+uXit-^,))^] 

N-1 „T N-1 

+ E ^[ E (^^")' / 0{At{s))ds\ + Y 
=E[ Y (a.(in+i,^(i,T+i)) - a.{tn,X{tn))E[ip,it„+i^) \ ^t-J^ 

N-1 „T ^^-1 

+ Y^^T. (^^")' / 0{At{s))ds] + J2 0{{Ai„,f) 

m=Q neJm -Jtn + i 

=E[ J2 (aKWi,^(i,T+i))-a.(in,X(t„))(^,(t„+i_)^] 

N-1 „T N-1 

+ E ^[ E (^*»)' / 0{At{s))ds] + Y 0{{Ai,n)') 
The expansion of the diffusion term appearing in (2.9) follows analogously. 



□ 

-1 



Observe that the number of realizations to determine a reliable error estimate is in general TOL 
much smaller than the, proportional to TOL~^, number of realizations to approximate E[g{X{T))]. For 
more details on this and the statistical approximation of the error density p see Remark 2.7 in p^ . 

3. An Error Estimate with Stochastic Time Steps 

This section derives error estimates with time steps which are stochastic and determined individually 
for each realization by the whole solution path X. The analysis will use the Malliavin derivative, 9vF(t)^i 
which is the first variation of a process Y with respect to a perturbation dW{t), at time t of the Wiener 
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process, cf. The Mahiavin derivative for a stochastic integral X is related to the first variation, 

d^(^fjX, for a perturbation of the position at time t by 



(3.1) 



T > t. 



if dXk = ak{X{t))dt + bi{X{t))dW'^ (cf. (^). 

We shall restrict the analysis to time steps which are constructed by first sampling the jump times to 
augment an a priori given time-discretization At, obtaining At[0] (see (3.6)) and then using the refinement 
criterion 



(3.2) 



At{t) = Ai[0](i)/2", for some natural number n = n{t,uj), 



\p{t,uj)\{At(t)Y < constant. 



with an approximate error density function, p, satisfying, for s € [0,T], t G [0,T] and all outcomes u, 
the uniform upper and lower bounds 



(3.3) 



c(TOL) < \p{s,Lu)\ < C(TOL), 
\dwit)P{s,Lo)\<C{TOh), 



for some positive functions c and C, with TOL/c(TOL) ^ as TO L 0. For each realization successive 
subdivisions of the steps yield the largest time steps satisfying (3_^). The corresponding stochastic 



increments AW will have the correct distribution, with the necessary independence, if the increments 
AW related to the new steps are generated by Brownian bridges, cf. [^, i.e. the time steps are generated 
by conditional expected values of the Wiener process. 

Let ^ be a constant approximating , where E[N] is the expected number of steps. The analysis 
in this section with adaptive non adapted time steps, satisfying ( |3.2| )-(3.3), is based on the following 
Stochastic time step algorithm described in next page. 



Section 4.4 presents a more precise formulation of this algorithm. Lemma 5.1 and Theorem 3.3 below 
show that although the steps generated by (|3.2D-(3.S) through the algorithm above are not adapted, the 



method indeed converges to the correct limit as the forward Eulcr method with adapted time steps. 



Lemma 3.1 (Strong convergence). Suppose that a,b,g,X satisfy the assumptions in Lemma 2.1 and 
that X is constructed by the forward Euler method, based on the stochastic time step algorithm above, 



with step sizes At„ satisfying (3. 
Assume also that X(0) = ^(0). Tk 



3.0 ) and their corresponding AWn are generated by Brownian bridges. 



sup ^jE[\Xit)-X{t)\^]=Oi 

0<t<T 



^tsup) ^OU 



I TOL 
c(TOL) 



)-o. 



as TOL — !■ 0, where At sup = sup„ ^ At„(a;). 

Proof. Let Q be the cr-algebra generated by the jumps and marks constructed in Remark |l.l| . Consider 
the conditional expectation E\\X(t) — X{t)\'^ \ Q] and apply Lemma 3.1 from using also that there 
is no time discretization error at the jump nodes. □ 



In addition to the dual functions and tp' in Theorem p.3[ , the new error expansion for stochastic time 
steps in Theorem 3.3 below also uses, for t — the discrete dual variation 



(3.4) 



which satisfies the backward problem (A.l) 



// /,\ » ' l4-\ dip'-,.{t;X{t)=x) 

1), i.e.. 
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Lemma 3.2. Let <f>, and Lp' he defined by (2.31) and (A.l) - (A. 4). Then ip" is given by 

next J 
next J 
next J 

+ dikm'i>j(t,X{t))ipj{tnext), t < T, 

a,fo„5(X(T)). 



(3.5) 



which is equivalent to (lA.l| 



(A.4) with tnext as in Definition 2.7. 



Stochastic time step algorithm: 
Do for M realizations ojj, j = I, . . . , M: 

Sample jump times: < Ti{ujj) < • • • < t ^ ^^{uj j) < T. 

Consider an augmented partition given by the union 



(3.6) 



At[0] = {t,M)}l=o'' = {4}:=o U {r„(c.,)}r'"'' 



that has N^iuij) = N + N{ujj) (a.s.) time steps. 

STEP 1: Set fc = 0. Start with the initial coarse mesh At[0] and compute 
AW[0]. 

STEP 2: For the piecewise constant mesh function At[k] with corresponding 
noise AVF[fc], compute X[k] and the weight function p[k] defined in (3.8-3.9). 
STEP 3: Define r{t) = \p[k]{t)\{At[k]{t))'^ and let for all t 

( At[k]{t), ifr(t) <<5, (t) 
\At[k]{t)/2, ifr{t)>S, (*) 

and in the refinement case (*) construct AM^[A:+ 1] by Brownian bridges based 
on the already known AVF[A:]. 

STEP 4: If at least one step of At[k] is refined by (7k-), increment fc by 1 and 
goto Step 2. Else all steps of At[k] satisfy (f) and accept the approximation 
g{X{T,ujj)) and goto the next realization of p and W. 
Enddo 

If the statistical error, E[g{X{T))] — jj 5(^(^i^j))i sufficiently small 

stop, else restart with a larger M. Endif 



At[k + l]{t) = 



Proof. Differentiation of the backward recursive equation (2.40) and the relations ( p. 34 -2. 39) together 
with 



(3.7) 



dx^(t)V'jp{inext) — 9x^{t„^^t)^'jpi'tnext)dx,„(t)Xritnext) 



prove as in ( 2.3(: )-( 2.39 ) that ip" satisfies (3^). Here, (3^) holds since the linear system for the variable 
{^{tnext),^' (tnext)) dcpcnds Only on the point values {X{s) : tnext < s < T}. □ 

The following theorem derives an error estimate applicable both to adaptive deterministic time steps 
and to the stochastic time step algorithm; the assumptions and the proof of the theorem focus on 
stochastic steps, however a modification to deterministic time steps is straightforward. The computable 
error density \p\ of this error estimate can then be cut-off for small and large values to satisfy ( |3.3[ ), see 
( p7| ) and (Q. 

Theorem 3.3 (Stochastic time steps error expansion). Suppose that a,b,g,X satisfy the assumptions in 
Lemma 2.1 and that X is constructed by the forward Euler method with step sizes Atn satisfying {3.i- 



3.0) and the corresponding AWn are generated by Brownian bridges, following the stochastic time step 



algorithm in Lemma 3.1. Assume also that X{0) = X(0) and E^X{Q)\^°] < C for some ko > 16. Then 
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the time discretization error has the following expansion, based on both the drift and diffusion fluxes and 
the discrete dual functions ip, (p' and ip" given in (A.l) - (A. 4), with computable leading order terms 

E[g{X{T))-g(X{T))] = e[J2 p(<n,X)(At„)2 

(3.8) 



n=0 



ji=0 



where 



p{tn,X) EE ^(^{dtOk + djUkaj + dijakdij)'pk{tn+i) 
(3.9) + [dtdk-ni + djdkmOj + dijdkmdij + 2djakdj„i)ip'k„^{t~^-^) 

+ {^djdkradjr)'Pkmr{tn+l)) ^ 



and the terms in the sum of ( |3.9| ) are evaluated at the a posteriori known points {tn, X{t„)), i.e. 

daa = daa{tn,X{tn)), 
dab = dab{tn,X{tn)), 
dad EE dad{tn,X{tn)). 

Proof. We consider the difference 

giXit))-g(X{t)) 

and apply Theorem 3.3 from using also that there is no time discretization error at the jump nodes. 
To this end, denote the set of stochastic time nodes by = {0 = to, ti ,ti,t2 , t2, ■ . ■ ,tN — T} and recall 
that the notation 

(3.10) X{tne.t) = ^X{t)), t e J, 



introduced in Definition 2.7 denotes one step with the Euler method. Write similarly one step with the 
exact solution 

(3.11) X{tnext)=HX{t)), tej. 

Introduce the notation X* = X{t) and X* = X{t). Now verify the representation 

_ Na-I _ _ 

(3.12) g{X{T)) ~ g{X{T)) = ^ (^(X(i„)) - <i>(X(t„))),^.(i,T+i) 
where the weight functions are defined recursively by the linear backward recursion 

V^{T)^ f d.,g{sX{T) + {l-s)X{T))ds, 

(3.13) 

Ht) = ( / d,^jisX{t) + (1 - s)Xit))ds^p>jitnext), t € J. 



To verify (3.12), first observe that by construction of the Euler method at every jump point t = i„ there 
is no local error in computing the next X value at time t„, i.e. 

^(X*) - $(X*) = 



so ( 3.12 ) is equivalent to 

(3.14) giX{T)) - g{X{T)) - ^($(X*) - '^(X^^tne.t). 
Then telescoping cancelation gives 

(3.15) g{X{T))-g{X{T)) = ^((X*-* -X'-%^,{tne.t) - {X' -X%Ht))- 

tej 

17 



Use the definitions (3.1C,3.11) and split the first term in the sum of (|A5|) into 



The two first terms above and the last term in the sum of ( |3.15| ) combine to zero by (3.13) 

(l>(X*) - Mx'))^'f^itne.t){X' -X\^,{t), 

which proves ( 3.12] ) 



The next step is to use the Malliavin derivative to analyze the expectation of the representation (3.12) 
by studying the dependence of X and on a small increment dW. This follows exactly the lines of the 
proof of Theorem 3.3 from |Q, and it is not reproduced here. □ 

4. Adaptive time-stepping algorithms 



Here, we describe two adaptive time-stepping algorithms for the weak approximation problem of (1.1) 
based on the approximation error formulas described in the previous section. They are very similar 
to those introduced in Algorithm-D is based on a quasi-deterministic mesh that is fixed for all 

realizations and its adaptive strategy is based on averaged information from the a posteriori error formula. 
On the other hand, Algorithm-S can adapt the time discretization differently for each realization. Proper 
sample of the Wiener process is possible by means of Brownian bridges. Both adaptive algorithms choose 
adaptively the number of realizations and the size of time steps to efficiently bound the approximation 
error by a prescribed error tolerance. 

4.1. Computational error splitting. The weak approximation computational error of the Monte Carlo 
Euler method, £c = E[g{X{T))\ — ■^'^jLi g{X{T,ujj)) , naturally separates to the time discretization 
error = E[giX{T))] - E[g(X{T))] and the statistical error £s = E[g(X(T))] - ^ J2f=i 9{'X{T,lJj)), 
i.e., £c = £t -\- £s- Thus, the control the computational error is related to the a combined control of the 
time discretization error via the choice of the time steps At and of the statistical error via the choice of 
the number M of the realizations. Therefore, we split a given computational error tolerance, TOL > 0, 
into a statistical error tolerance TOLg and a time discretization error tolerance TOL^ (see |Q, ||2^) by 

(4.1) TOW = i TOL and TOL^ = |TOL. 

4.2. Control of the statistical error. For M independent samples {Y{u)j)}'-Li of a random variable 
y, with i?[ < cx), define the sample average by 

1 

and the sample standard deviation by 

S{Y; M) = {A{Y^: M) - {A{Y; M)fY . 

Let (7y = \e[\Y - E\Y]\^]Y and Z^, = ^ {A{Y]M) - E\Y]) with cumulative distribution function 

Fz,„{x) EE P{Zm <x), X ^ R. Let A ee {E[\Y - E\Y]\^]Y/^ / ay < oo, then the Berry-Esseen theorem, 
cf. p^ , gives the following estimation in the Central Limit Theorem 

snp\Fz,,{x)-JV{x)\ < -^X^ 

for the rate of convergence of Fzj^., to the distribution function A/" of a normal random variable with mean 
zero and variance one, i.e. 

(4.2) J\f{x)^ f -^eiipf-l-sA ds. 



27r V 2 

Since in the examples below M is sufficiently large, i.e. M ^ 36A®, the statistical error 

£s{Y;M) = E[Y]-AiY;M) 
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satisfies, by the Berry-Esseen theorem, the following probability approximation 



P 



\£s{Y;M)\<c„ 



ay 



2M{c„) - 1. 



In practice choose some constant > 1.65, so the normal distribution satisfies 1 > 2M{c^^) — 1 > 0.901 
and the event 



(4.3) 



\£s{Y-M)\<Es{Y-M)^c, ^^^'^'^^ 



'M 



has probability close to one, which involves the additional step to approximate ay by S{Y;M), cf. JTsf . 
Thus, in the computations '^s{Y] M) is a good approximation of the statistical error EsiY', M). 

For a given TOLg > 0, the goal is to find Af such that Es{Y;M) < TOLg. The following algorithm 
adaptively finds the number of realizations M to compute the sample average A{Y; M) as an approxima- 
tion to With probability close to one, depending on c^, the statistical error in the approximation 
is then bounded by TOLg. For technical reasons (see |Q) we choose M = 2", n G N. 

routine Monte-Carlo(TOLs, Mq; EY) 

Set the batch counter m — 1, M[m\ = Mq and Es[m\ = +oo. 
Do while (Es[m] > TOLs) 

Compute M[m\ new samples of Y, along with the sample 
average EY = A{Y; M[m]), the sample standard deviation 
S[m] = S{Y; M[m]) and the statistical error estimation 
Es[m + 1] = Es(y, M[m]). Compute M[m + 1] by 
change_M (M[m], S[m], TOL^; M[m+l]). 
Increase m by 1. 

end-do 

end of Monte-Carlo 



(4.4) 



MCH X M„ 



(4.5) 



routine change_M (Mi„, Sin, TOLs; Mout) 

M* — min|integer part ^^ '^^^^ ^ ^ 
n = integer part (loga M*) + 1 
Mout =2". 

end of change_M 

Here, Mq is a given initial value for A/, and MCH > 1 is a positive integer parameter introduced to 
avoid a large new number of realizations in the next batch due to a possibly inaccurate sample standard 
deviation S[ni]. Indeed, A/[m + 1] cannot be greater than MCH x Af[m]. 

4.3. Deterministic time stepping algorithm. F ollow ing closely ]29| a nd |p7| , we present an adaptive 
algorithm based on cut-off of the error density p in (3.£) of Theorem |3.3|, po, which is defined as 



(4.6) 



in I max I j^j^ 



,TOL^ ,TOL" 



n = 1, 



,iV. 



The error expansion in Theorem |3.3| motivates us to approximate the time discretization error by 

N 



(4.7) 



\^t\<E 



where the error indicator, r„, is defined by 

(4.8) r„ = pS(Af„)2 n = l, 

The main advantage of the deterministic time stepping algorithm over the stochastic time stepping 
algorithm is that the number Mt of realizations necessary to determine the optimal deterministic stepping 
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scheme is considerable smaller than M , whereas, in the stochastic time stepping algorithm, a refinement 
of the partition is carried out in each one of the M trajectories, leading to a considerable larger amount 
of computational work. 



As pointed out before, the error expansion derived in Theorem 3.3 is also valid with deterministic 



time steps. Therefore, we have some flexibility in the choice of error densities for the adaptive algorithm 
with deterministic time steps because we can also use the results from Theorem |2.3|. There are some 



practical differences to mention. On the one hand, the variance of the averaged error density from (2.£) is 

0{ ^ - - 



^~ ) . This feature has been observed in and where a local filtering procedure was proposed 
to reduce the variance of the error density estimator. A positive feature of this error density is that it 
does not require the computation of the second variation, ip" , which may be computationally expensive 
for large d. On the other hand, the averaged error density from ( |3.9D has a much smaller variance 0{j^) 
which does not need filtering but it requires the computation of if". In this work we will not discuss 
further this choice and only show numerical results with adaptive deterministic time steps based on the 
error density (BJh. 



The approximation of the time discretization error in the right hand side of (4.7) can be separated 
into two parts 



(4.9) 



E 



■ N 

E 



N 



E 



N 



where the second error term in the right hand side of (4.9) is with probability close to one asymptotically 
bounded by 



(4.10) 



E 



■ TV 

E^ 

.n=l 



N 



< 



Ets = Co 



S{j:n=irn;MT 



and the first term defines Ett = A (^^=1 . Then for a given TOL^ > 0, the goal is to construct a 

partition At of [0, T], with as few time steps and realizations Mt as possible, such that E^^^+Ets ^ TOLy 
To this end, first split the time discretization tolerance TOL^ in two positive parts TOL^t and TOL^s 
for Ej-T and E^s, respectively. The statistical error of the time discretization using the density ( |4.6| ) is 
0{^^=r). Therefore the percentage of the tolerance, TOL, devoted to the control of the statistical time 



discretization error can be arbitrary small as At 



sup 



0. In practice we choose 



(4.11) 



TOLtt = ^TOLt = ^TOL, 
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TOLts = ^TOLt = ^TOL. 



The control of the statistical time discretization error determines the number of realizations Mt necessary 
to ensure a reliable choice of the time discretization in the deterministic time stepping algorithm. Similarly 
as in pl| ], here it is optimal to equidistribute the error contributions from different time intervals. Thus, 
the goal of the adaptive algorithm described below is to construct a deterministic time partition At of 
[0, T] such that 



(4.12) 



-4(r„;MT)<di ^ , n = 1, . 



,,7V, 



where di — 2, see Remark 3.9 in 



To achieve (4.12), start with an initial partition At[l\ and then specify iteratively a new partition 
At[k + 1], from At[fc], using the following refinement strategy: 



(4.13) forn= l,2,...,iV[fc] 

(4.14) if f„[fc] > di F7~j~i then divide At„[fc] into 2 uniform substeps. 

N[k\ 

else let the new step be the same as the old 

(4.15) endif. 
endfor. 

until the following stopping criteria is satisfied: 

/ _ TOLttA 

(4.16) if max r„[fcl < Di — — then stop. 

Here Di is a given constant satisfying Di > -di where c « 1/2, see p^. The combination of (|4.9|) and 



(4.16) asymptotically guarantees a given level of accuracy, Ett < DjTOLtt- The posit ive numbers Di 



is motivated to avoid slow convergence in case almost all r„ satisfy (4.16), as in Section 



4.4 



Now we are ready for the detailed definition of the adaptive algorithm with deterministic steps: 

Algorithm D 
Initialization Choose: 

(1) an error tolerance, TOL = TOLs + TOLtt + TOLts, 

(2) a number, N[l], of initial uniform steps for [0, T], 

(3) a number, M[l], of initial realizations and set Mt[1] = Af[l], 



(4) a number, di = 2 in (4.14) and c = 1/2 + 1/20 to compute Di using Di > -di, and 



(5) a constant cq > 1.65 and an integer MCH> 2 to determine the number of realizations in (l.f;) 

Set the iteration counter, fc, for time refinement levels, to 1 and set the statistical 
error, Ets — +oo and f[k] — +oo. 



Do while ( r[k] violates the stopping (4.16) or Ets > TOLts ) 
Compute the sample averages and the error estimates on At[k] 
by caUing Euler. Set Mrik + 1] = M rjk] and At[k + 1] = Ai[k] 
If ( f[k] violates the stopping ( 4.16| ) 



For all time steps i = 1, . . . , N[k], do the refinement process (4.14) 
to update Ai[k + 1] from Ai[k]. 
elseif ( Ets > TOLts ) 

Update Mrik + 1] by change_M (Mt[A:],5ts[^], TOLts; Mrlk + 1]). 
end- if 

Increase fc by 1. 
end-do 

Compute an approximation. Eg, for E[g{X{T))] with fixed time mesh At = At[k] 
by Monte-Carlo(TOLs, A/tW; Eg) in (g^. 

Accept Eg as an approximation of E[g{X{T))], since the estimate of the 
computational error is bounded by TOL. 



routine Euler 

For each Af^ [fc] new realizations, sample jump times with their corresponding marks. 
As described in Section 1.2, use At[k] to compute corresponding realizations of the Euler method. 
Update the approximations of the time discretization error 
indicators f[k] and the statistical time discretization error Ets[^] and 
compute the sample standard deviation Sxsik] = S{g{X{T)); Mrik]). 
end-of-Euler 
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4.4. The stochastic time stepping algorithm. Now we describe an adaptive algorithm with stochas- 
tic time steps based on a cut-ofF of the error density p introduced in (3.£) of Theorem |3.3| , ps, defined by 
as 

(4.17) = min(^max(^|p(t„,X)|,TOL*) ,TOL"^^ , n^l,...,N^. 



Following the error expansion in Theorem 3.3, the time discretization error is approximated by 

(4.18) < E 
where the error indicator, r„, is defined by 

(4.19) r„ = pS(A^„)^ n = l,...,iV^. 

In this case it is optimal, cf. ||29|l , to equidistribute the error contributions among all time steps and all 
realizations. In other words, the goal of the adaptive algorithm is to construct a time partition At of 
[0, T] for each realization such that 

(4.20) r„<sii^, n = l,...,iV^, 

where si = 2, see Remark 3.1 in |2^. Note that in practice the quantity E[Na] is not known and we can 
only estimate it by a sample average A{Na; M) from the previous batch of realizations. The statistical 
error |£'[A^^] — A{Na; M)\ is then bounded by E5(iV^;M), with probability close to one, by the same 
argument as in (4.3). 

Let A^Ab] = A{Na; M[j]) be the sample average of the final number of time steps in the j-th batch 
of M[j] realizations. To achieve (4.20) for each realization, start with an initial partition At[l] and then 
specify iteratively a new partition At[k + 1], from Ai[fc], using the following refinement strategy: 

for each realization in the j-th batch 

for each time step n — 1 , . . . , Na [k] 

if r„[fc] > si=S2L2_^ then divide Atn[k] into 2 uniform substeps. 

(4.21) else let the new step be the same as the old 
endif 

endfor. 
endfor. 



The refinement strategy (4.21) motivates the following stopping criteria: for each realization of the 
j-th batch 



(4.22) 



iff max rJk]<Si^^^] then stop. 



where Si > ^ Si with c w i, see [pOf 



Now we are ready for the detailed definition of the adaptive algorithm with stochastic steps: 

Algorithm S 
Initialization Choose: 

(1) an error tolerance, TOL = TOLs + TOLt, 

(2) a number N[l] of initial uniform steps At[l] for [0, T] and set Na = N[l], 

(3) a number M[l] of initial realizations, 

(4) a number si — 2 in (4.21) and c = i -f ^ to compute Si using 5*1 > | si, and 

(5) a constant cq > 1.65 and an integer MCH> 2 to determine the number of realizations in (4.E). 
Set the iteration counter for batches m = 1 and the stochastic error Es[m] = +oo. 

Do while ( Es[m] > TOL5 ) 

For realizations j = 1, . . . , M[to] 
Set k = 1 and r[k] — +00. 
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Generate the jump times and their marks (r, Z) — {(t£, Zg)}^^^. 
Start with the initial partition At[k] and generate AT/F[fc]. 
Compute g{X(T))[J] and N[J] by routine Control-Time-Error, 
end-for 

Compute the sample average Eg = A {g{X{T));M[m]), the sample 
standard deviation S[m] = S{g{X{T)); M[m]) and the a posteriori bound 



for the statistical error Es[m] = Es{g{X{T)), M[m\) in ( [4.31) . 
if ( Es[m] > TOLs ) 

Compute M[m + 1] by change_M(M[TO], S[m], TOLg; M[m + 1]), of. 

(|4.5|) , and update A^a = A {Na[J]; M[m]), where the random variable 
[J] is the final number of time steps on each realization, 
end- if 

Increase m by 1. 
end-do 

Accept Eg as an approximation of E[g{X{T))], since the estimate of the 
computational error is bounded by TOL. 

routine Control-Time-Error(At[fc], AM^[fc], r[fc], (r, Z); g(X{T))[J], N[J]) 
Do while ( r[k] violates the stopping ( 4.22| ) ) 



Compute the Euler approximation X\k] i n Section and the error indicator 
r[k] in ( |4.8| ) using the error density ( 4.17 ) on At[k] with the known 



Wiener increments AV1/^[A:]. 

If ( r[k] violates the stopping ( [4.22| ) ) 

For time steps i = 1, . . . , N[k] 

Do the refinement process ( |4.2l| ) to compute At[k + 1] from At[k] 
and compute AM^[A: + 1] from AVF[/c] using Brownian bridges, 
end-for 
end-if 

Increase fc by 1. 
end-do 

Set the number of the final level J = /c — 1. 
end of Control-Time-Error 



5. Numerical Experiments 

This sections shows numerical results from the implementation of the a posteriori error approximation 
formula presented in Section |^ and of the adaptive algorithms described in Section ^. The programs we 
wrote uses double-precision FORTRAN 77 and is based on the code written for the numerical experiments 
in [3^ . For the numerical simulation of the uniform distribution U{0,1) and the normal distribution 
^(0,1), it applies a double-precision modification of the functions rani and gasdev proposed in Js^ , 
provided an initial seed which must be a negative integer. In particular we use iseed for the simulation of 
the Wiener process increments, zseed for the simulation of the jump marks and tseed for the simulation 
of the jump times. 

To perform our computations, we consider a system of stochastic differential equations of the form 



( |l.lD with: d = 2, 4 = £i = 1, 

a{t,x) ^ { - X2, xi + ^ X{t) X2 ), b^{t, x) = \J sin(a;i), 0^, 

c{t,x,z) = {Q,z^-^-x,) 

and a time dependent intensity \{t) = (1 -I- . The distribution for the jump marks is time dependent 
and such that E[Z'^] — 1. In particular, we use 

Zk = cos(27rrfc) + sin(27rTfc) 2 VS (t/fc - i) 
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where {rfe} are the jumps constructed in Remark 1.1 and {Uk} is a sequence a sequence oiU{0, 1) i.i.d. 
random variables. In this case, due to assumed for of X{t) the inverse function is given exphcitly by 
A~^(s) — exp(s) — 1. This example is a generahzation of Example 5.1 in [ p^ , as here we admit a time 
dependent intensity of the underlying Poisson process and also time dependent distribution of the marks. 



Taking g{x) 



T = I and X{0) = (0, 0) the exact solution of the corresponding weak approximation 



problem is given by the formula 

E[giXiT))]=\XiO)f 



A(£) = 1 ^— 

l+s 1+T 



The value of the parameters needed in the simulations are MCH = 10, cq — 1.65, iseed = —7, zseed = 
— 101 and tseed = —20. We note that the expected value of the number of jumps points equals A(T) = 
A(l) = log(2) ~ 0.693. Therefore, the computational cost of including the jump times of the process X in 
our discretization is fairly low, see also the sampled values of max in Table 5.3, and it is asymptotically 
negligible as the required accuracy, TOL, tends to zero. 

5.1. Deterministic time step algorithm. First, we perform overkilling runs in order to test how 



realistic is the a posteriori error approximation of the time discretization error described in Theorem 2.3 
The results, shown in Table 5.1, show that the ratio of the computational error and its computable 
approximation tends to 1 as the number of uniform time steps N increases. For each value of N, we 
choose the number of realizations M large enough in order to keep the total statistical error at the level 
of 1% of the size of the obtained approximation of the time discretization error. 



Alg. D 


Co = 1.65 


iseed = —7 




r> Et-\-Es-\-Ets 


N 


M 




Es + Ets 


[min{A, i?}, max{yl, B}] 


5 


10 X 10^ 


-0.0602 


5.87 X 10-4 


[1.026, 1.046] 


10 


50 X 10^ 


-0.0314 


2.33 X 10-^ 


[1.019, 1.035] 


20 


100 X 10^ 


-0.0159 


1.54 X 10-"* 


[1.008, 1.028] 



TABLE 5.1. Computing the efficient index of the Algorithm D. 

Table 5.2 contains the results of the Algorithm D with an adaptive choice of both the deterministic 
time steps and the number of realizations. The program starts with M — 100 and A^ = 5 subintervals 
as an initial uniform partition of the time interval [0, T] = [0,1]. The tolerance TOL = 0.02 is divided 
into TOLs 0.01333, TOL^ = 0.00444 and TOL„ = 0.00222. When the algorithm stops, the size of 
the total approximation error is less than 2T0L according to the stopping criterion (4.16) and it agrees 
with the size of the computational error. 
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Alg. D 




c„ = 1.65 


MCH = 10 


iseed=-7 


TOL = 0.02 




Iter. 


N 


M 


£c 


Ej- 


Ets 


Es 


1 


5 


100 


-0.03272 


-0.05889 


0.02112 


0.14602 


2 


5 


1000 


-0.04844 


-0.06005 


0.00684 


0.04835 


3 


5 


10000 


-0.06592 


-0.06070 


0.00232 


0.01675 


4 


5 


12088 


-0.05215 


-0.05925 


0.00204 


0.01459 


5 


10 


12088 


-0.03738 


-0.03196 


0.00107 


0.01403 


6 


20 


12088 


-0.02585 


-0.01633 


0.00054 


0.01369 


7 


20 


14122 


-0.02559 






0.01261 



TABLE 5.2. Adaptive choice of M and At with Algorithm D. 



5.2. Stochastic time step algorithm. To observe the performance of the stochastic time steps Algo- 
rithm S, we apply it for different values of TOL, starting with a number of realizations M = 100 and a 
number of uniform time steps N — 5. Table 5.3 contains the obtained results which show that Algorithm 
S is also effective in giving us an approximation of the quantity of interest within the margin of 2T0L 
due to the criterion (4.16). 



TOL 


M 


A{N^;M) 


min A^^ 


maxiV_4 


S{N,;M) 


max 


Es 


Sc 


0.040 


3.2x10^ 


8.3 


5 


41 


5.2 


6 


2.5x10"^ 


-2.2x10"^ 


0.020 


12.0x10^ 


10.9 


5 


73 


9.2 


6 


1.3x10-^ 


-6.0x10"^ 


0.010 


47.9x10^ 


20.5 


10 


146 


16.9 


6 


6.6x10"^ 


-1.0x10"^ 


0.005 


186.4x10^ 


30.1 


10 


193 


32.2 


7 


3.3x10--' 


-3.2 xlO-'* 



TABLE 5.3. Adaptive choice of M and At with Algorithm S. 
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Appendix A. Discrete Dual Equations 

This appendix section is dedicated to the dete rmination of the discrete dual functions (p{t) e M.'^, 
ip'{t) eR'^'"^ and ip"(t) ^Rdy d^d (see Theorem U and Theorem U), where t is a node of the (stochastic) 
partition of the time interval [0, T] which is used by the Eulcr method (see Section (L2)). First, introduce 
the auxiliary functions Ai and q, defined by 

A,{tn,x) = + Atna^{tn,x) + AW^bl{tn,x) ^ X G , i = l,...,d, 
Ci{t, X, z) = Xi + Ci{t, X, z) Vx e R'', V z e Z, i — 1, . . . ,d. 



Then, for each realization, ip, (p' and if" are constructed by the following algorithm: 
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Dual backward time stepping algorithm. 




Set the initial backward values 




¥^iitNA) = dig {X [In a)). y^'tkitNA) = d^kg{X{tNA) 

[ 1\. . J_ 1 

fikrnitNA) = dikmg{X{tNA))- 




ior Na-1,...,0 




if {tn+i is a jump time) then set = (i^^;^, X(t^^i), Z„+i) 


and 


^.(i,T+i)=a,?,(P"+i)^,(t„+i), 




^^kitn+l) - dkMT'''''') V'j,{tn+l) + d.kM-P'' 






(A.2) + a,„,B,-(7'"+l) dkZp{V'^+^) f',p{tn+l) 




+ 9,c,(7'"+i) afc™Bp(7'"+i) ^;p(t„+i) 








+ d,kmCj{r''+')iPj{tn+i), 




else set 




(A.3) (p,(t;^+i) = (^,(t„+i), ¥'y(t^+i) = V'ij(in+l), V'i/cml^rT+l) 


— ^'ikmitn+l)- 


end- if 




Set V'' = {tn,X{tn)) and 




(/.,(t„) = 9,l,(^")^,(i-+i), 




<^:,(i„) = a.A,(P") 9^1,(7^") v-p{t-+i) + d.kMT^n ^dtn 




V'lkmitn) - d.A,{V-) dkMVn dmMV'') v'^pAtn+l) 
+ d,rnMVndkMrnv'jpitn+l) 




+ da,{vndkmMrnv'jpitn+i) 




+ d,kA,{vn d„ApiT^l f'jpitn+i) + dikmA,{V") 




end-for 





Observe that the estimate in Theorem |2.3| needs only the computation of tp and ip' and it is therefore 
less expensive per realization. Also, with respect to the actual implementation of the dual backward time 



stepping it is useful to notice that the blocks (A.2) and (A. 4) differ only in the function call c and A 
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